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INTRODUCTION 

This paper presents an algorithm for computing a pseudo- 
inverse for any linear operator, A : R n -*■ , where R n 

and R m are Hilbert spaces of degree N and M , respec- 
tively. Problems involving linear operators that are "ill- 
conditioned" or "difficult to invert" are being encountered 
in ever increasing numbers. Computer programs using standard 
inversion techniques often provide no solution to these prob- 
lems and no information indicating an alternate approach to 
the problem. 

The algorithm to be discussed provides a method for 
minimizing | |Ax - b| | for a fixed vector b in R m . The 
inner products for R and R are not restricted to the 
usual inner product for Euclidean complex space. The nota- 
tion for the inner product of two vectors x and y will 
be (x,y) and the only norm used will be | |x| | = ^(x,x) . 

Although the inner product on R^ may be different from the 
one on R , there will be no special notation to differen- 
tiate these inner products since the space to which the 
vectors x and y belong will dictate the inner product to 
be used. Computer programs for minimizing | | Ax - b|| for 
any norm other than the Euclidean norm are essentially 
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nonexistent at MSC. It should be pointed out that standard 
inversion procedures are more desirable for inversion of 
square matrices, and sometimes for solving the least squares 
problem, provided that one is inverting "nicely -behaved" 
operators. 

AN ALGORITHM FOR PSEUDOINVERSION 

Let A ; R n R m be a linear operator. Let 
e i ,e 2 ,***,e n denote unit orthogonal vectors in R n and the 
vectors A(e^) in R m will be denoted by b^ . The sub- 
space of R^ spanned by the vectors b^b ,»**b n will be 
referred to as R(A) or the range of the operator A . 

If y is an element of R , then there exists a 
' m ’ 

unique vector x belonging to the range of A such that 


I ly - *| 


inf [ | y - m| 
meR(A) 


CD 


The purpose of this paper is to characterize all 
vectors, a = (oij » a 2 » * ' * » a n ) » such that 


n n 

■ X) °i b i * S °i A(e i 

i=l i=l 


) = A(a) 


( 2 ) 
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and to provide an algorithm for calculating some of these 
vectors . 


If an orthogonal basis for R(A) is known, then a 
representation for x can be obtained by 


x = P (y) 


(5) 


where P : R m -*• R(A) is the orthogonal projection operator 
onto the range of A . Hence the collection of all vectors, 
a , satisfying (1) are precisely those such that 


P (y) - A(cO 


(4) 


Using the Gram-Schmidt process, a set of vectors, 
2 i » z 2 » * * * ,z n * which span R(A) can be represented in 
terms of the b^ as follows. Let z x = b t and for 
j = 2 , 3 , • • • ,n . Let 


j-1 


z. - b. 


rE 


(b,. 


1 = 


I l*il I 


I! z. 

2 X 


(5) 


wh 


ere the sum is taken over the such that ||z^|| \ 0 . 


Define Q = {z^ : ||z^|| ^ 0} . Then Q is a collec- 
tion of orthogonal vectors which span R(A) and the 
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projection operator can be defined by 


p(y) 


(y» ^i) 

= WH*iii 2 

1 1 2 i 1 1 4 o 


( 6 ) 


If a set of vectors, c i> c 2 , *** ,c n * can defined 
such that they span R n and A(c^) = z^ , then any vector, 
a * (Oj ,a 2 , •• • ,o n ) , such that 


for those z^ such that 


(y, 



| | z^ | | 4 0 satisfies 


( 7 ) 



(y. z.) 

= 7 — r Z. + 0 

I I z ± I I 1 

1 1 z i 1 1 4 o 


- p(y) 


( 8 ) 
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Moreover, this will completely characterize all vectors, 
c , such that A(c) « P(y) . Suppose there is a vector, c , 
such that A(c) = P(y) ; then there is a unique set of scalars 


(VV*° n ) SUC ^ that ^ 



(since the vectors 


i=l 

c^ are linearly independent). The vectors, z^ e Q , are 
orthogonal and span R(A) , thus the coefficients of the 
z^ are such that 


a i z i = P (y) z. e Q (9) 

i = l 

are unique. Thus, 

a i = (y, 2 / I I 2 ± I I 2 ( 10 ) 

for every z^ in Q . 

If the vectors, c i» c 2>'**» c n * can defined, then 
the collection of all vectors 8 such that A(B) - P(y) is 
given by 



( 11 ) 


the satisfy (10) above ! . 
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The vectors, c i» c 2»**’> c n * are defined by the 
following recurrence relation. 

Initialize c t = e t then for j = 2,3,***,n let 


c j ' e 


. ( V hi c 

J friiUiii 2 1 

IUill+0 


(U) 


It remains to be shown that A(c^) = , ' and that the 

vectors c^ are linearly independent. First, note that 
A(Cj) = A(ej) = bj = Zj . Then, for j = 2,3,***,n 


Atcp • Atep - 


j ' 1 Cb. . z . ) 

i » 6 i J 




i=l 1 

ll* A l 1+0 


z . 

1 


2 


(13) 


j-1 

At^, - b - £ 


(b ■; » z i) 


(14) 


IUill+0 
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Thus 


A(Cj) * Zj for j * 2,3,***,n 

One can quickly observe that the vectors c^ are 
linearly independent by expanding (12) , and noting that if 

y ^ a i c^ = 0 , | 0 , then there exists scalars 

1 = 1 

d t ,d 2 , • • • ,d k _i such that 




* e k = 0 


which is a contradiction. 


Consider now the operator, A , defined as follows. 
For any q 4 ® in R(A) , q has a unique representation ’ 
as a linear combination of the vectors in Q . If 

q 

define 



*(,) 
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and 

A(0) - 0 

If H(A) is used to denote the subspace of R n 
spanned by the vectors c^ such that z^. 4 © * then 

A 

A : R(A) H (A) is a one-to-one onto map. Moreover, for 

any x in H(A) , AA(x) = x , and for any q in R(A) , 

A A A 

AA(q) * q . We cannot say that AA = AA since their 
domains are different. We cannot say that the matrix 

/S A 

representation of either AA or AA is the identity map 

since AA : R p -*■ H(A) is the identity only on H(A) and 

AA : R(A) -*■ R m , but R(A) C R m may not be generated by 

the e. in R„ . 
i m 

The following operator does provide a convenient 
representation for a solution to (1). For y e R^ , define 
A + (y) = A P(y) , then a * A + (y) is a solution to (1). 

This follows immediately since P(y) belongs to the range 
of A . Thus, A(a) = AA P(y) = P(y) . If N (A) denotes 
the space spanned by the c^ such that * 0 , then 
E = (g = A + (y) + u : y e N(A)} is the collection of all 
vectors satisfying (2). , 

For those who are familiar with the generalized 
inverse, it should be pointed out that A + satisfies the 
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Penrose equations, A + AA + = A + and AA + A = A , but not 
the other Penrose equations. Moreover, A + y satisfies 
(1) , but it is not necessarily the vector having minimal 
norm which satisfies this equation. 

^APPLICATIONS TO MATRIX INVERSION 

For computational purposes, the following modification 
of this algorithm is more practical. Choose a tolerance e 
such that if | |x| | < e for any vector x , x = 0 . 
Normalize all vectors b^ and let 

<*i » VUbJI (15) 

Obtain orthogonal unit vectors z by the following recur- 
rence relation. 

. • ^ A 

Initialize z x = *= z x = . For j = 2,3, • • • ,n 

iet 

j-1 

z i ’ ' L (d r *i ' 16 > 

i = l 

I I z i I 140 
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Use Wilkinson's modified Gram-Schmidt 


j-1 


z . 
1 


Z j 


E a a 


(17) 


i = l 


*i.H+o 


If ||Zj|| < e , set z. ~ 0 . Otherwise, set 


Zj - Zj/| |Zj I I 


Notice that e is used as a relative error tolerance 
since all vectors are normalized. 

All other calculations are carried out as before and 
we have Cj = e x and for j = 2,3,***,n 


j-1 


hu e 3 ' (d j’ Z i } C i 




i = l 


(18) 


|zj I I 


A 

If the vectors z^ are stored by columns in a matrix 
ZH and the c^ are stored in a matrix ZI by columns. 
Then 


ZI (ZH T ) 


(19) 
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and 

P (y) = ZH(ZH T y) (20) 

For square nonsingular matrices, A + = A 1 . 

Moreover, it would be advisable to use standard matrix 
inversion algorithms for matrices that are not ill-conditioned. 

For matrices that are ill-conditioned, this algorithm 
provides the following advantages: 

1. The reason for the error return "ill-conditioned 
matrix" is accompanied by reason as to why A 1 is 
difficult to compute. For example, if 

| | z 5 | | = 10" 6 , then there exists scalars 

8 1 ,g 2 »83»g«, such that 

gj b i + b 5 = e (21) 

where 

llell < 10' 6 I |b s | | 

2. Although the matrix is ill-conditioned, a solution 
x is computed such that 5 = | |Ax - b | | + 0 is a 
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minimum, where x = A + y . Even though 
6 = | | Ax - b | | $ 0 , the value for 6 is often 
quite acceptable. 


3. The range of A is characterized by unit orthogonal 
vectors. Using these, the value for inf j |Ax - y|| 
can be computed for any number of values for y 
without calculating A + , that is 

6 = | |ZH(ZH T y) - y | | • 

4. The null space, N(A) , is characterized by 
linearly independent vectors. 


If the computation of A + y is the primary objective, 
then the error check for equation (17) could be augmented 

r 

or replaced by the alternate check 


if 


(7, Zj) 


Y l(y ’ z i ) 

i = l 


< e set z 1 . 8 0 
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APPLICATION TO "LEAST SQUARES" PROBLEMS 

The problem of minimizing | | Aq - y|| , where 

A = {a^} = fj(x^) f° r = 1,2, ••*,m and j = l,2,***,n , 
occurs frequently in applied mathematics. If 
| | Aq - y I | 2 = (Aq - y)* (Aq - y) then this problem is 
usually solved using orthogonal polynomials, or in the case 
of general functions, by the solution to the normal equations 

q = (A* A) ’ 1 A* y 

provided A* A is nonsingular. 

Polynomial approximations are insufficient for solving 
a large class of problems. For this class of problems and 
for polynomial problems where nonconsecutive coefficients 
must be zero, the algorithm described in this paper has the 
following advantages: 

1. The matrix A* A is often "ill-conditioned" when 
A is not. The matrix A* A is never used, thus 
avoiding excessive rounding error in some insta.ces. 

A solution is always obtained even when A is not 
of full rank, that is, when A* A is singular. 


2. 
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3. If k significant decimal places are desired in the 

_ V 

solution, then set e = 10 and all functions which 
individually do not affect the first k significant 
digits of the solution will be discarded. This will 
permit more accurate orthogonalization and conse- 
quently a more accurate estimation of the coeffi- 
cients for the more significant functions. This 
also allows more functions to be used for fitting 
and is sometimes the deciding factor in obtaining 
an acceptable value for | | Aq - y|| . 

4. The value for min||Aq - y|| can be computed with- 
out computing the solution q = A + y . Thus, if 
the value is not acceptable, no time need be used 
to compute A + , that is, 

min| | Aq - y| | = | |ZH ZH T y - y| | . 

5. The process described in 3 and 4 can be carried out 

sequentially with respect to the functions f^ . 

That is to say that after J columns of the ZH 

matrix have been computed, if 6 = ||ZH ZH y - y | | 

is sufficiently small, then the solution, 

+ T 

A = ZI(ZH ) , can be computed using only that 
portion of the matrices available and setting 
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= 0 for i > J . If 6 is not acceptable, then 
the process can be continued until the number of 
functions is exhausted or an acceptable value is 
obtained. 

It should be pointed out, as in the case of square 
matrices, that standard procedures for solving the least 
squares problem require less computer time and fewer storage 
locations than the algorithm discussed. In this sense, the 
standard procedures are much more desirable provided an 
acceptable solution is obtained. 

CONCLUSION 

The fact that other algorithms are often more desirable 
must be emphasized strongly. There are highly recommended 
methods based upon extremely well planned numerical analysis 
using less computer storage and less execution time. The 
Gram-Schmidt orthogonalization is known to be an unstable 
process and the accuracy of the process has not been opti- 
mized in the procedure presented in this paper. This algo- 
rithm does have the advantage that all quantities computed 
have an obviou c relation to the original operator. This per- 
mits one to determine the reasons for the difficulty in 
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inverting an operator or in computing A + . For "least 
squares" or minimum no^m type problems, the method has the 
advantages that 1) solutions can be computed sequentially 
with respect to the approximating functions, 2) the norm of 
the error vector can be determined without computing A + 
and 3) functions affecting the solution in "insignificant" 
decimal places can be automatically discarded. Problems 
requiring only two or three significant decimal places occur 
quite often in problems related to spaceflight. Discarding 
functions in this manner can also improve the behavior of 
the solution at points between the values of the independent 
variable used for the regression analysis. 


